Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Validator health status #3207

Open
wants to merge 3 commits into
base: albatross
Choose a base branch
from
Open

Validator health status #3207

wants to merge 3 commits into from

Conversation

viquezclaudio
Copy link
Member

@viquezclaudio viquezclaudio commented Dec 18, 2024

The validator health is used to track the overall validator health based on the number of consecutive deactivations of a validator on a per epoch basis.

The number of consecutive deactivations would cause a delay in terms of when the reactivation transaction is sent.

Pull request checklist

  • All tests pass. The project builds and runs.
  • I have resolved any merge conflicts.
  • I have resolved all clippy and rustfmt warnings.

@viquezclaudio viquezclaudio force-pushed the viquezcl/validator branch 2 times, most recently from f08b8d3 to 8108a5b Compare December 23, 2024 15:18
@viquezclaudio viquezclaudio marked this pull request as ready for review December 26, 2024 17:37
@@ -113,7 +113,7 @@ where
let (v, c) = build_validator(
peer_ids[i],
Address::from(&validator_keys[i]),
false,
true,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make automatic_reactivate a fn param for build_validators so it doesn't influence other tests?

/// The current validator health
pub health: ValidatorHealth,
/// Number of blocks that we have produced in time(without being inactivated)
pub blk_cnt: u32,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
pub blk_cnt: u32,
pub block_counter: u32,

@@ -181,6 +198,8 @@ impl<TValidatorNetwork: ValidatorNetwork + 'static> NextProduceMicroBlockEvent<T
continue;
}

self.health_state.write().blk_cnt += 1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
self.health_state.write().blk_cnt += 1;
self.health_state.write().blk_cnt.saturating_add(1);

Prevent overflow

Comment on lines 168 to 170
log::warn!(block = block.block_number(), "Not publishing block");
let event = ProduceMicroBlockEvent::MicroBlock;
break Some(Some(event));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
log::warn!(block = block.block_number(), "Not publishing block");
let event = ProduceMicroBlockEvent::MicroBlock;
break Some(Some(event));
log::warn!(block = block.block_number(), "Not publishing block");
break Some(Some(ProduceMicroBlockEvent::MicroBlock));

pub health: ValidatorHealth,
/// Number of blocks that we have produced in time(without being inactivated)
pub blk_cnt: u32,
/// For testing/debug purposes control wether produced blocks are published by the validator
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// For testing/debug purposes control wether produced blocks are published by the validator
/// For testing/debug purposes control whether produced blocks are published by the validator

pub struct HealthState {
/// The current validator health
pub health: ValidatorHealth,
/// Number of blocks that we have produced in time(without being inactivated)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// Number of blocks that we have produced in time(without being inactivated)
/// Number of subsequent blocks that we have been produced without being inactivated

good_blocks = %self.health_state.read().blk_cnt,
"Current validator health is yellow",
);
if self.health_state.read().blk_cnt >= VALIDATOR_HEALTH_THRESHOLD {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This logic doesn't seem to match with the original description: If the validator is Yellow and is not deactivated in a quarter of an epoch, we change its status to Green.

inactivated = red_block_number,
"Current validator health is red",
);
if self.health_state.read().blk_cnt >= VALIDATOR_HEALTH_THRESHOLD {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This logic doesn't seem to match with the original description: If the validator is Red and is not deactivated in one epoch, we change its status to Yellow.

Comment on lines 927 to 928
let validator_health = self.health_state.read().health;
match validator_health {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
let validator_health = self.health_state.read().health;
match validator_health {
match self.health_state.read().health {

}
ValidatorHealth::Red(_) => {
log::warn!(
"The validator needs human intervention, no automatic reactivate"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"The validator needs human intervention, no automatic reactivate"
"The validator needs human intervention, no automatic reactivation"

@viquezclaudio viquezclaudio added the WIP This pull request is still work in progress label Jan 13, 2025
@viquezclaudio viquezclaudio force-pushed the viquezcl/validator branch 2 times, most recently from 4bfe5c1 to cccd5e5 Compare January 14, 2025 19:44
@viquezclaudio viquezclaudio removed the WIP This pull request is still work in progress label Jan 14, 2025
@viquezclaudio viquezclaudio force-pushed the viquezcl/validator branch 2 times, most recently from 60bc95c to 0341c37 Compare January 28, 2025 21:05
@viquezclaudio viquezclaudio added the WIP This pull request is still work in progress label Jan 28, 2025
@viquezclaudio viquezclaudio force-pushed the viquezcl/validator branch 4 times, most recently from 8fb7bd7 to 28c51c2 Compare January 30, 2025 17:12
@viquezclaudio viquezclaudio removed the WIP This pull request is still work in progress label Jan 30, 2025
}

/// This flag should only be used in tests context.
pub fn _set_publish_flag(&mut self, publish: bool) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The method can be annotated with #[cfg(test)] such that it's only included in cargo test and not in cargo build.

@@ -46,6 +46,9 @@ use crate::{
/// in the first place.
/// (**) The validator may be set to automatically reactivate itself upon inactivation.
/// If this setting is not enabled the state change can only be triggered manually.
/// However, there is a delay incurred if the validator is deactivated multiple consecutive times
/// in the current epoch. The delay is function of the number of deactivations and is reduced
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// in the current epoch. The delay is function of the number of deactivations and is reduced
/// in the current epoch. The delay is a function of the number of deactivations and is reduced


use nimiq_keys::Address;

/// Struct that represents the overall health of a validator
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ValidatorHealthState is an enum not a struct

address: Address,
/// Only used for testing purposes controls whether blocks are published by the validator
publish: bool,
/// Number of consecutive inactivations that have ocurred in the current epoch
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// Number of consecutive inactivations that have ocurred in the current epoch
/// Number of consecutive inactivations that have occurred in the current epoch

publish: bool,
/// Number of consecutive inactivations that have ocurred in the current epoch
inactivations: u32,
/// Next block number where the re-activate transaction should be sent
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// Next block number where the re-activate transaction should be sent
/// Next block number indicating when the re-activate transaction should be sent

@@ -181,6 +196,9 @@ impl<TValidatorNetwork: ValidatorNetwork + 'static> NextProduceMicroBlockEvent<T
continue;
}

// Each successfull block will decrease the number of inactivations
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Each successfull block will decrease the number of inactivations
// Each successful block will decrease the number of inactivations

/// Increases the number of consecutive deactivations
pub fn inactivate(&mut self, block_number: u32) {
if !self.pending_reactivate {
self.inactivations += 1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

saturating add?

@@ -15,7 +15,7 @@ use nimiq_network_interface::{
};
use nimiq_network_libp2p::Network;
use nimiq_network_mock::{MockHub, MockNetwork};
use nimiq_primitives::{networks::NetworkId, policy::Policy};
use nimiq_primitives::{coin::Coin, networks::NetworkId, policy::Policy};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unused imports?

@@ -24,9 +24,11 @@ use nimiq_test_utils::{
},
};
use nimiq_time::{sleep, timeout};
use nimiq_transaction_builder::TransactionBuilder;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above

.write()
._set_publish_flag(false);

events.take(30).for_each(|_| future::ready(())).await;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess you can take 50 here and get in the red state immediately. Testing if the validator got the yellow status is already covered in another test.

@viquezclaudio viquezclaudio force-pushed the viquezcl/validator branch 2 times, most recently from 8955533 to 090ee97 Compare January 31, 2025 21:42
Track the number of blocks that are succesfully produced when changing the validator health
When a validator is deactivated, the reactivate transaction will have a delay
based on the number of consecutive deactivations the validator has experienced
in the current epoch.
The amount of delayed blocks is of quadratic nature
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants